Application of Additive Groves Ensemble with Multiple Counts Feature Evaluation to KDD Cup'09 Small Data Set
Abstract
This paper describes a field trial of a recently developed ensemble method, Additive Groves, in the KDD Cup'09 competition. Additive Groves were applied to the three tasks provided at the competition using the "small" data set. On one of the three tasks, appetency, we achieved the best result among participants who likewise worked with the small data set only. Post-competition analysis showed that the less successful result on another task, churn, was partially due to insufficient preprocessing of nominal attributes. Code for Additive Groves is publicly available as part of the TreeExtra package. Another part of this package provides an important preprocessing technique that was also used for this competition entry: feature evaluation through bagging with multiple counts.
Similar resources
Rating Prediction with Informative Ensemble of Multi-Resolution Dynamic Models
The Yahoo! music rating data set in KDD Cup 2011 raises several interesting challenges: (1) The data covers a lengthy time period of more than eight years. (2) Not only are the training ratings associated with date and time information, so are the test ratings. (3) The items form a hierarchy consisting of four types of items: genres, artists, albums and tracks. To capture the rich temporal dynamics with...
Feature Engineering and Ensemble Modeling for Paper Acceptance Rank Prediction
Measuring research impact and ranking academic achievement are important and challenging problems. Having an objective picture of research institutions is particularly valuable for students, parents and funding agencies, and also attracts attention from government and industry. KDD Cup 2016 proposes the paper acceptance rank prediction task, in which the participants are asked to rank the import...
Combining Factorization Model and Additive Forest for Collaborative Followee Recommendation
Social networks have become more and more popular in recent years. This popularity creates a need for personalization services that recommend tweets and posts (information) and celebrities and organizations (information sources) to users according to their potential interests. The Tencent Weibo (microblog) data in KDD Cup 2012 brings one such challenge to the researchers in the knowledge discovery and data m...
Anomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors
With the advancement and development of computer network technologies, the way for intruders has become smoother; therefore, the importance of intrusion detection systems (IDS) as one of the key elements of security for detecting threats and attacks is increasing. One of the challenges of intrusion detection systems is managing the large number of network traffic features. Removing un...
Winning the KDD Cup Orange Challenge with Ensemble Selection
We describe our winning solution for the KDD Cup Orange Challenge.